Abstract —Light detection and ranging systems reconstruct 
scene depth from time-of-flight measurements. For low light- 
level depth imaging applications, such as remote sensing and 
robot vision, these systems use single-photon detectors that 
resolve individual photon arrivals. Even so, they must detect 
a large number of photons to mitigate Poisson shot noise and 
reject anomalous photon detections from background light. We 
introduce a novel framework for accurate depth imaging using a 
small number of detected photons in the presence of an unknown 
amount of background light that may vary spatially. It employs a 
Poisson observation model for the photon detections plus a union- 
of-subspaces constraint on the discrete-time flux from the scene 
at any single pixel. Together, they enable a greedy signal-pursuit 
algorithm to rapidly and simultaneously converge on accurate 
estimates of scene depth and background flux, without any 
assumptions on spatial correlations of the depth or background 
flux. Using experimental single-photon data, we demonstrate 
that our proposed framework recovers depth features with 1.7 
cm absolute error, using 15 photons per image pixel and an 
illumination pulse with 6.7-cm scaled root-mean-square length. 
We also show that our framework outperforms the conventional 
pixelwise log-matched filtering, which is a computationally 
efficient approximation to the maximum-likelihood solution, by 
a factor of 6.1 in absolute depth error. 

Index Terms —Computational imaging, LIDAR, single-photon 
imaging, union-of-subspaces, greedy algorithms. 

I. Introduction 

A conventional light detection and ranging (LIDAR) system, 
which uses a pulsed light source and a single-photon detector, 
forms a depth image pixelwise using the histograms of photon 
detection times. The acquisition times for such systems are 
made long enough to detect hundreds of photons per pixel 
for the finely binned histograms these systems require to 
do accurate depth estimation. In this letter, we introduce a 
framework for accurate depth imaging using only a small 
number of photon detections per pixel, despite the presence 
of an unknown amount of spatially-varying background light 
in the scene. Our framework uses a Poisson observation 
model for the photon detections plus a union-of-subspaces 
constraint on the scene’s discrete-time flux at any single pixel. 
Using a greedy signal-pursuit algorithm, a modification of 
CoSaMP [1], we solve for accurate estimates of scene depth 
and background flux. Our method forms estimates pixelwise 
and thus avoids assumptions on transverse spatial correlations 
that may hinder the ability to resolve very small features. 
Using experimental single-photon data, we demonstrate that 
our proposed depth imaging framework outperforms 
log-matched filtering, which is the maximum-likelihood (ML) 
depth estimator given zero background light. 

Because our proposed framework is photon-efficient while 
using an estimator that is pixelwise and without background 
calibration, it can be useful for dynamic low light-level 
imaging applications, such as environmental mapping using 
unmanned aerial vehicles. 

A. Prior Art 

The conventional LIDAR technique of estimating depth 
using histograms of photon detections is accurate when the 
number of photon detections is high. In the low photon-count 
regime, the depth solution is noisy due to shot noise. It has 
been shown that image denoising methods, such as wavelet 
thresholding, can boost the performance of scene depth 
recovery in the presence of background noise. Also, an 
imaging model that incorporates occlusion constraints was 
proposed to recover an accurate depth map. However, these 
denoising algorithms implicitly assume that the observations 
are Gaussian distributed. Thus, at low photon counts, where 
depth estimates are highly non-Gaussian, their performance 
degrades significantly. 

First-photon imaging (FPI) is a framework that allows 
high-accuracy imaging using only the first detected photon 
at every pixel. It demonstrated that centimeter-accurate depth 
recovery is possible by combining the non-Gaussian statistics 
of first-photon detection with spatial correlations of natural 
scenes. The FPI framework uses an imaging setup that includes 
a raster-scanning light source and a lensless single-photon 
detector. More recently, photon-efficient imaging frameworks 
that use a detector array setup, in which every pixel has the 
same acquisition time, have also been proposed. 

We observe two common limitations that exist in the prior 
active imaging frameworks for depth reconstruction. 

• Over-smoothing: Many of the frameworks assume 
spatial smoothness of the scene to mitigate the effect of 
shot noise. In some imaging applications, however, it is 
important to capture fine spatial features that only occupy 
a few image pixels. Using methods that assume spatial 
correlations may lead to erroneously over-smoothed 
images that wash out the scene’s fine-scale features. In such 
scenarios, a robust pixelwise imager is preferable. 

Fig. 1. An illustration of the single-photon imaging setup for one illumination pulse. A pulsed optical source illuminates a scene pixel with photon-flux 
waveform s(t). The flux waveform r(t) that is incident on the detector consists of the pixel return a s(t − 2d/c), where a is the pixel reflectivity, d is the 
pixel depth, and c is the speed of light, plus the background-light flux b. The rate function λ(t) driving the photodetection process equals the sum of the pixel 
return and background flux, scaled by the detector efficiency η, plus the detector’s dark-count rate b_d. The record of detection times from the pixel return (or 
background light plus dark counts) is shown as blue (or red) spikes, generated by the Poisson process driven by λ(t). 


• Calibration: Many imaging methods assume a calibration 
step to measure the amount of background flux existing 
in the environment. This calibration mitigates bias in 
the depth estimate caused by background-photon or dark-count 
detections, which have high temporal variance. 
In practical imaging scenarios, however, the background 
response varies in time, and continuous calibration may 
not be practical. Furthermore, many methods assume 
background flux does not vary spatially. Thus, a calibrationless 
imager that performs simultaneous estimation of 
scene parameters and spatially-varying background flux 
from photon detections is useful. 

In this letter, we propose a novel framework for depth 
acquisition that is applied pixelwise and without calibration. 
At each pixel, our imager estimates the background response 
along with scene depth from photon detections. Similar to 0 , 
we use a union-of-subspaces constraint for modeling the scene 
parameters. However, our union-of-subspaces constraint is 
defined for both the incoherent signal and background waveform 
parameters that generate photon detections; the constraint in 
0 is defined for only the coherent signal waveform that is 
perturbed by Gaussian noise, not photon noise. 

Using the derived imaging model, we propose a greedy 
signal pursuit algorithm that accurately solves for the scene 
parameters at each pixel. We evaluate the photon efficiency 
of this framework using experimental single-photon data. In 
the presence of strong background light, we show that our 
pixelwise imager gives an absolute depth error that is 6.1 times 
lower than that of the pixelwise log-matched filter. 

II. Single-Photon Imaging Setup 

Figure 1 illustrates our imaging setup, for one illumination 
pulse, in which photon detections are made. A focused optical 
source, such as a laser, illuminates a pixel of the scene with the 
pulse waveform s(t) that starts at time 0 and has root-mean-square 
pulsewidth T_p. This illumination is repeated every T_r 
seconds for a sequence of N_s pulses. The single-photon detector, 
in conjunction with a time correlator, is used to time stamp 
individual photon detections relative to the time at which the 
immediately preceding pulse was transmitted. These detection 
times are observations of a time-inhomogeneous Poisson 
process whose rate function combines contributions from the 
pixel return, background light, and dark counts; they are used to 
estimate scene depth for the illuminated pixel. This pixelwise 
acquisition process is repeated for N_x × N_y image pixels by 
raster scanning the light source in the transverse directions. 

III. Forward Imaging Model 

In this section, we study the relationship between the 
photon detections and the scene parameters. For simplicity of 
exposition and notation, we focus on one pixel; this is repeated 
for each pixel of a raster-scanning or array-detection setup. 

Let a, d, and b be unknown scalar values that represent 
reflectivity, depth, and background flux at the given pixel. 
The reflectivity value includes the effects of radial fall-off, 
view angle, and material properties. Then, after illuminating 
the scene pixel with a single pulse s(t), the backreflected 
waveform that is incident at the single-photon detector is 

    r(t) = a\,s(t - 2d/c), \quad t \in [0, T_r). \qquad (1)

A. Photodetection statistics 

Using (1), we observe that the rate function that generates 
the photon detections is 

    \lambda(t) = \eta\,(a\,s(t - 2d/c) + b) + b_d, \quad t \in [0, T_r), \qquad (2)

where η ∈ (0, 1] is the quantum efficiency of the detector and 
b_d ≥ 0 is the dark-count rate of the single-photon detector. 

Let Δ be the time bin duration of the single-photon detector. 
Then, we define M = T_r/Δ to be the total number of time 
bins that capture photon detections. Let y be the vector of size 
M × 1 that contains the number of photon detections at each 
time bin after we illuminate the pixel N_s times with pulse 
waveform s(t). Then, from photodetection theory, we can 
derive 

    y_k \sim \mathrm{Poisson}\left( N_s \int_{(k-1)\Delta}^{k\Delta} \left[ \eta\,(a\,s(t - 2d/c) + b) + b_d \right] dt \right), \qquad (3)

for k = 1, ..., M. Note that we have assumed that our total 
pixelwise acquisition time N_s T_r is short enough that b is 
constant during that time. Let 

    v_j = \int_{(j-1)\epsilon}^{j\epsilon} a\,\delta(t - 2d/c)\,dt, \qquad (4)

    S_{i,j} = \int_{(i-1)\Delta}^{i\Delta} \int_{(j-1)\epsilon}^{j\epsilon} N_s\,\eta\,s(t - y)\,dt\,dy, \qquad (5)

    B = N_s\,\Delta\,(\eta\,b + b_d), \qquad (6)

for i = 1, ..., M and j = 1, ..., N, where ε is a small number 
such that T_r is divisible by ε and N = T_r/ε. Defining 1_{M×1} 
to be an M × 1 vector of ones, we can approximate the rate 
function in (2) and rewrite the distribution in (3) as 

    y_k \sim \mathrm{Poisson}\big( (S v + B\,1_{M \times 1})_k \big), \qquad (7)

for k = 1, ..., M. Finally, defining A = [S, 1_{M×1}] and x = 
[v^T, B]^T, we can further rewrite (7) as 

    y_k \sim \mathrm{Poisson}\big( (A x)_k \big). \qquad (8)

So far, we have simplified the pixelwise single-photon observation 
model, such that the M × 1 photon-count vector y is 
a linear measurement of the (N + 1) × 1 scene response vector x, 
corrupted by Poisson noise. 
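
To make the linear Poisson model in (8) concrete, the following is a minimal simulation sketch rather than the authors' code. It assumes a Gaussian pulse shape, equal time-bin and delay grids (Δ = ε, so M = N), and a hypothetical scale factor `gain` that stands in for the constant factors in (5); all function and parameter names are illustrative.

```python
import numpy as np

def build_measurement_matrix(num_bins, pulse_rms_bins, gain=1.0):
    """Return A = [S, 1] on a grid with M = N = num_bins.

    S[i, j] models the mean count in time bin i for a unit-reflectivity
    return at delay index j. The Gaussian pulse is an assumed shape.
    """
    i = np.arange(num_bins)[:, None]   # time-bin index (rows of S)
    j = np.arange(num_bins)[None, :]   # delay index (columns of S)
    S = gain * np.exp(-0.5 * ((i - j) / pulse_rms_bins) ** 2)
    return np.hstack([S, np.ones((num_bins, 1))])

def simulate_counts(A, depth_index, reflectivity, background, rng):
    """Draw y_k ~ Poisson((A x)_k) for a single-reflector pixel."""
    x = np.zeros(A.shape[1])
    x[depth_index] = reflectivity   # the single nonzero entry of v
    x[-1] = background              # the background term B
    y = rng.poisson(A @ x)
    return y, x

rng = np.random.default_rng(0)
A = build_measurement_matrix(num_bins=64, pulse_rms_bins=2.0)
y, x = simulate_counts(A, depth_index=20, reflectivity=5.0, background=0.1, rng=rng)
```

The last column of A is all ones, so the background term raises the mean count of every time bin by the same amount, which matches the role of B in (7).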


B. Scene parameter constraints 

Using the expression in (4), we observe that 

    v_j = \int_{(j-1)\epsilon}^{j\epsilon} a\,\delta(t - 2d/c)\,dt \qquad (9)

        = a\,\mathbf{1}_{\{x \,:\, (j-1)\epsilon < 2x/c < j\epsilon\}}(d), \qquad (10)

for j = 1, ..., N, where 1_A(x) is an indicator function that 
equals 1 if x ∈ A and 0 otherwise. In other words, vector v 
has exactly one nonzero element, and the value and index of 
the nonzero element represent the scene reflectivity and depth 
at an image pixel, respectively. 

We defined our (N + 1) × 1 signal x to be a concatenation 
of v, which is the scene response vector of size N, and B, 
which is the scalar representing background flux. Since v has 
exactly one nonzero entry, x lies in the union of N subspaces 
defined as 

    \mathcal{S}_N = \bigcup_{k=1}^{N} \left\{ x \in \mathbb{R}^{N+1} : x_{\{1,\dots,N\} \setminus \{k\}} = 0 \right\}, \qquad (11)

where each subspace is of dimension 2. 
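
Because each subspace in the union fixes all but one of the first N coordinates to zero, the Euclidean projection onto the union has a simple closed form: keep the largest-magnitude entry among the first N coordinates, keep the last (background) coordinate, and zero the rest. The sketch below illustrates this under that reading; the function names are hypothetical.

```python
import numpy as np

def project_onto_union(x):
    """Project a length-(N+1) vector onto the union of subspaces.

    The closest subspace is the one indexed by the largest-magnitude
    entry among the first N coordinates; the last coordinate is free.
    """
    x = np.asarray(x, dtype=float)
    k = int(np.argmax(np.abs(x[:-1])))   # index of the retained depth bin
    projected = np.zeros_like(x)
    projected[k] = x[k]
    projected[-1] = x[-1]
    return projected, k

def in_union(x):
    """Check membership: at most one nonzero among the first N entries."""
    return int(np.count_nonzero(np.asarray(x)[:-1])) <= 1
```

Each projected vector has at most two nonzero entries, one depth coefficient and one background coefficient, which is why every subspace in the union has dimension 2.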

IV. Solving the Inverse Problem 

Using accurate photodetection statistics and scene constraints, 
we have interpreted the problem of robust single-photon 
depth imaging as a noisy linear inverse problem, where 
the signal of interest x lies in the union of subspaces S_N. 

Using (8), the observed photon-count histogram y has the 
probability mass function 

    f_Y(y; A, x) = \prod_{k=1}^{M} \frac{e^{-(Ax)_k}\,(Ax)_k^{y_k}}{y_k!}. \qquad (12)


Thus, neglecting terms in the negative log-likelihood function 
that are dependent on y but not on x, we can define the 
objective function 


    \mathcal{L}(x; A, y) = \sum_{k=1}^{M} \left[ (Ax)_k - y_k \log (Ax)_k \right]. \qquad (13)

This objective can be proved to be convex in x. 
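
One way to verify the convexity claim is sketched below, under the assumption that (Ax)_k > 0 for every k so that the logarithm is defined. Here a_k^T denotes the k-th row of A, a notation introduced only for this sketch.

```latex
\nabla_x \mathcal{L}(x; A, y) = \sum_{k=1}^{M} \left( 1 - \frac{y_k}{(Ax)_k} \right) a_k,
\qquad
\nabla_x^2 \mathcal{L}(x; A, y) = \sum_{k=1}^{M} \frac{y_k}{(Ax)_k^2}\, a_k a_k^T \succeq 0 .
```

Each term of the Hessian is a non-negative multiple of a rank-one positive semidefinite matrix because y_k ≥ 0, so the Hessian is positive semidefinite wherever the objective is defined.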

We solve for x by minimizing L(x; A, y) with the constraint 
that x lies in the union of subspaces S_N. Also, because 
photon flux is a non-negative quantity, the minimization results 
in a more accurate estimate when we include a non-negative 
signal constraint. In summary, the optimization problem that 
we want to solve can be written as 

    \begin{aligned}
    \underset{x}{\text{minimize}} \quad & \mathcal{L}(x; A, y) \\
    \text{s.t.} \quad & x \in \mathcal{S}_N, \\
    & x_i \ge 0, \quad i = 1, \dots, N + 1.
    \end{aligned} \qquad (14)

To solve our constrained optimization problem, we propose 
an algorithm that is inspired by existing fast greedy algorithms 
for sparse signal pursuit. CoSaMP [1] is a greedy algorithm 
that finds a K-sparse approximate solution to a linear inverse 
problem. We modify the CoSaMP algorithm so that we obtain 
a solution constrained to the union of subspaces S_N, 
instead of a K-sparse one. 


Algorithm 1 Single-photon depth imaging using a union-of-subspaces model 

Input: y, A, δ 
Output: x^(k) 

Initialize x^(0) ← 0, u ← y, k ← 0; 
repeat 
    k ← k + 1; 
    x̃ ← A^T u; 
    Ω ← supp((x̃_{1:N})_[1]) ∪ supp(x^(k−1)) ∪ {N + 1}; 
    b|_Ω ← A_Ω^† y; b|_{Ω^c} ← 0; 
    x^(k) ← T_0([(b_{1:N})_[1]^T, b_{N+1}]^T); 
    u ← y − A x^(k); 
until ||x^(k−1) − x^(k)||_2 < δ 


Our proposed greedy algorithm is given in Algorithm 1. We 
define T_0(x) to be the thresholding operator setting all negative 
entries of x to zero, supp(x) to be the set of indices of nonzero 
elements of x, and x_[k] to be the vector that approximates x 
with its k largest terms. Also, we take A_S to be a matrix with 
columns of A chosen by the index set S. Finally, we use A^T 
and A^† to denote the transpose and pseudo-inverse of matrix 
A, respectively. 

To solve the constrained optimization problem in (14), our 
algorithm iteratively performs the following steps: 

Fig. 2. Experimental pixelwise depth imaging results using single-photon observations. The number of photon detections at every pixel was set to be 15. 
The figure shows the (a) photograph of imaged face, (b) ground-truth depth, (c) depth from log-matched filtering, which is approximately ML, and (d) depth 
using our method. Also, (e) and (f) show the absolute depth error maps for (c) and (d), respectively. 


1) gradient descent on L(x; A, y), which is approximated 
by the squared ℓ2-norm ||y − Ax||_2^2 for computational 
efficiency; 

2) projection of the intermediate estimate onto the closest 
subspace in the union of subspaces S_N; and 

3) projection of the intermediate estimate onto the 
non-negative cone, 

until a convergence criterion is satisfied. We define convergence 
of the solution as ||x^(k−1) − x^(k)||_2 < δ, where δ is a 
small number. 
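
The three steps above can be sketched in NumPy as follows. This is an illustrative reading of Algorithm 1, not the authors' released code: the function name, the default value of `delta`, the `max_iter` safeguard, and the use of `numpy.linalg.lstsq` in place of an explicit pseudo-inverse are all assumptions made here.

```python
import numpy as np

def union_of_subspaces_pursuit(y, A, delta=1e-4, max_iter=50):
    """Greedy pursuit over the union of subspaces (sketch of Algorithm 1).

    A has N + 1 columns: N delay columns followed by an all-ones
    background column. Returns the estimate and the iteration count.
    """
    y = np.asarray(y, dtype=float)
    n = A.shape[1] - 1                  # N; column index n is the background
    x = np.zeros(n + 1)
    u = y.copy()                        # residual, initialised to y
    iterations = 0
    for iterations in range(1, max_iter + 1):
        proxy = A.T @ u                 # gradient of the squared l2 surrogate
        best = int(np.argmax(np.abs(proxy[:n])))         # best 1-term depth support
        omega = np.union1d(np.flatnonzero(x), [best, n])  # merged support
        b = np.zeros(n + 1)
        b[omega] = np.linalg.lstsq(A[:, omega], y, rcond=None)[0]
        keep = int(np.argmax(np.abs(b[:n])))   # project onto the closest subspace
        x_new = np.zeros(n + 1)
        x_new[keep] = max(b[keep], 0.0)        # non-negativity thresholding
        x_new[n] = max(b[n], 0.0)
        u = y - A @ x_new
        step = np.linalg.norm(x - x_new)
        x = x_new
        if step < delta:
            break
    return x, iterations
```

The index of the nonzero entry among the first N coordinates of the returned vector gives the estimated delay bin, and the last coordinate gives the estimated background level.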

V. Experimental results 

To validate our imaging framework, we used a dataset 
collected by D. Venkatraman for the First-Photon Imaging 
project Q; this dataset and others are available from |TQ| . The 
experimental setup uses a pulsed laser diode with pulsewidth 
T_p = 270 ps and repetition period T_r = 100 ns. A two-axis 
galvo was used to scan 350 × 350 pixels of a mannequin face 
at a distance of about 4 m. A lensless SPAD detector with 
quantum efficiency η = 0.35 was used for detection. The 
background light level was set using an incandescent lamp. 
The original mannequin data from |TQ| had the background 
count rate approximately equal to the signal count rate. Our 
experiment uses the cropped data showing only the mannequin’s 
face, where the background count rate was approximately 0.1 
of the average signal count rate. Although we used a 
raster-scanning setup for our single-photon imaging experiments, 
since our imaging algorithm is applied pixelwise, it can be 
also used for imaging with a floodlight illumination source 
and a single-photon detector array. 

We could compare our imaging method with the ML estimator 
of scene parameters. Unfortunately, due to nonzero background 
flux, ML estimation of a, b, and d requires minimizing 
a non-convex cost function, leading to a solution without 
an accuracy guarantee. Thus, zero background is assumed 
conventionally such that the ML depth estimate reduces to 
the simple log-matched filter: 

    \hat{d}_{\mathrm{ML}} = \frac{c\,\epsilon}{2} \left( \underset{j \in \{1, \dots, N\}}{\arg\max} \; (\log S_j)^T y \right), \qquad (15)

where S_j is the j-th column of S and the logarithm is applied elementwise. 

We use (15) as the baseline depth estimator that is compared 
with our proposed estimator. 
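
For comparison, the log-matched filter baseline amounts to correlating the photon-count histogram with the elementwise logarithm of each column of S and picking the best-scoring delay bin. A minimal sketch follows; the function name and the `floor` guard against log(0) are assumptions introduced here, and the function returns a delay-bin index rather than a depth in meters.

```python
import numpy as np

def log_matched_filter(y, S, floor=1e-12):
    """Return the delay-bin index that maximizes (log S_j)^T y.

    Entries of S below `floor` are clipped before taking the logarithm
    so that empty kernel entries do not produce -inf scores.
    """
    log_S = np.log(np.maximum(S, floor))
    scores = log_S.T @ np.asarray(y, dtype=float)
    return int(np.argmax(scores))
```

Because this estimator assumes zero background, every background or dark-count detection is scored as if it were signal, which is consistent with the noisy baseline estimate reported in the experiments.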

Figure 2 shows the results of recovering depth of the 
mannequin face using single-photon observations. The kernel 
matrix S was obtained by an offline measurement of the pulse 
shape. Note that this measurement depends only on the source, 
not on properties of the scene. The ground-truth depth, shown 
in Fig. 2(b), was generated separately by using 
background-calibrated ML estimation from 200 photons at each pixel. 

In our depth imaging experiment, the number of photon 
detections at each pixel was set to 15. We observe that, due 
to extraneous background photon detections, the log-matched 
filter estimate in Fig. 2(c) (average absolute error = 10.3 cm) 
is corrupted with high-variance noise and the facial features 
of the mannequin are not visible. On the other hand, our 
estimate, shown in Fig. 2(d), shows high-accuracy depth 
recovery (average absolute error = 1.7 cm). As shown by the 
error maps in Fig. 2(e) and (f), both methods fail in depth recovery 
in the face boundary regions, where very little light is reflected 
back from the scene to the single-photon detector and the 
signal-to-background ratio is thus very low. Also, we observe 
that our estimated average background level over all pixels 
was B = 1.4 × 10“^, which is very close to the calibrated 
true background level B = 1.3 × 10“^. In this experiment, we 
had M = N = 801. Also, we set δ = 10“^ and the average 
number of iterations until convergence was measured to be 2.1 
over all pixels. Code and data used to generate results can be 
downloaded from mi- 

VI. Conclusions and Future Work 

In this letter, we presented an imaging framework for 
calibrationless, pixelwise depth reconstruction using single-photon 
observations. Our imaging model combined photon detection 
statistics with the discrete-time flux constraints expressed 
using a union-of-subspaces model. Then, using our imaging 
model, we developed a greedy algorithm that recovers scene 
depth by solving a constrained optimization problem. 

Our pixelwise imaging framework can be used in low 
light-level imaging applications, where the scene being imaged has 
fine features and filtering techniques that exploit patchwise 
smoothness can potentially wash out those details. For example, 
it can be useful in applications such as airborne remote 
sensing, where the aim is to recover finely featured 3D 
terrain maps. 

It is straightforward to generalize the proposed single-photon 
imaging framework to multiple-depth estimation, 
where more than one reflector may be present at each pixel. 
In the case of estimating depths of K reflectors at a pixel, 
the 1-sparsity assumption must be changed to the more general 
K-sparsity assumption when defining the union-of-subspaces 
constraint. 